Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

return better structured error logs to connector builder #46963

Merged
merged 8 commits into from
Oct 23, 2024

Conversation

lmossman
Copy link
Contributor

@lmossman lmossman commented Oct 17, 2024

What

Partially resolves https://github.com/airbytehq/airbyte-internal-issues/issues/6977

This is the first of 3 PRs to have better structured error messages in the Connector Builder.
See the other 2 PRs that build on top of this one:

How

This PR changes how the connector_builder module of the CDK returns trace messages back to the connector builder server in a few key ways:

  • Breaks apart the error message and stacktrace into separate fields
  • Adds the internal_message of the trace message to another field in the returned Log
  • Suppresses the final raised AirbyteTracedException which is only meant to cause a non-zero return value to result in a sync failure, but does not provide any useful information beyond the other traced exceptions that are already yielded

Testing

Follow the Testing instructions in the description of this PR to test this end-to-end:

Can this PR be safely reverted and rolled back?

  • YES 💚
  • NO ❌

Copy link

vercel bot commented Oct 17, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
airbyte-docs ⬜️ Ignored (Inspect) Visit Preview Oct 23, 2024 8:06pm

@octavia-squidington-iii octavia-squidington-iii added the CDK Connector Development Kit label Oct 17, 2024
@lmossman lmossman changed the title return better structured error logs, and don't return raised trace ex… return better structured error logs, and don't return raised trace exception Oct 18, 2024
@lmossman lmossman changed the title return better structured error logs, and don't return raised trace exception return better structured error logs to connector builder Oct 18, 2024
Comment on lines +48 to +51
try:
return response.content.decode("utf-8")
except Exception:
return None
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change allows this method to also properly handle response bodies which are just string values, since that currently causes a JSONDecodeError

@lmossman lmossman marked this pull request as ready for review October 18, 2024 19:08
Copy link
Contributor

@natikgadzhi natikgadzhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code change itself seems perfectly fine. This will have longer-term compatibility concern (log error message model should map to the protocol better), but this can be solved later. Left a comment on that.

@@ -39,6 +39,8 @@ class StreamReadSlices:
class LogMessage:
message: str
level: str
internal_message: Optional[str] = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The trade off here is that since this does not map to the protocol error message model, when we move to using source-declarative-manifest container as manifest runner for Builder, we'd need to make sure that the protocol has these fields, otherwise error reporting will break.

Should we make the work and align these models in this file with protocol-level models now instead? Or ship the small improvement and record a todo for later?
/cc @bnchrch

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats a good callout Natik!

Looking into it, this does already align with the protocol model AirbyteTraceMessage

https://github.com/airbytehq/airbyte-protocol/blob/e24ee151574a7f0f4e997b90063d7c9ffed2b158/protocol-models/src/main/resources/airbyte_protocol/v0/airbyte_protocol.yaml#L256

@lmossman can you confirm Im reading this as this is already a protocol blessed type?

If so should LogMessage become a union type of AirbyteTraceMessage and AirbyteLogMessage

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The trade off here is that since this does not map to the protocol error message model, when we move to using source-declarative-manifest container as manifest runner for Builder, we'd need to make sure that the protocol has these fields, otherwise error reporting will break.

This LogMessage class is only used in the message_grouper logic which wraps a call to the read protocol method call.

Even if we swap out the runtime to use source-declarative-manifest to execute the read, we will still need all of this extra message_grouper logic around it to create the StreamRead output object that the connector builder server expects, and it shouldn't need to be changed because the get_message_groups function does operate on the airbyte protocol already, and is just normalizing those airbyte protocol messages into a form that is easier understood by the connector builder server.

If we don't want to directly execute any python anymore at all and therefore delete this message_grouper logic, then the connector builder server will require a bunch of other changes anyway to do similar logic to what that python code is doing now.

So, I think I'd prefer to keep this PR simple right now, especially because I don't think we can easily reference the protocol AirbyteLogMessage and AirbyteTraceMessage in our connector-builder-server and airbyte-server openapi specs, and I'd rather not repeat those entire schemas as they are more complicated than the LogMessage schema I've defined here.

But let me know if anything I said above sounds off @natikgadzhi @bnchrch !

Copy link
Contributor

@bnchrch bnchrch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two Questions! Should pass once those are addressed

# is that this message will be shown in the Builder.
if (
traced_exception.message is not None
and "During the sync, the following streams did not sync successfully" in traced_exception.message
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❓ Can we modify AbstractSource to indicate its a "Final Exception"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bnchrch Do you have a specific approach in mind? We could set internal_message to something for example to indicate this but this would still rely on comparing strings in that approach.

I'm wary to raise a different exception in abstract_source here because I'm not sure what relies on this being an AirbyteTracedException and I don't feel familiar enough with that fairly central CDK code to know what is safe to change

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also add that any raised AirbyteTracedException is sort of already implied to be the final exception. Within abstract_source.py, https://github.com/airbytehq/airbyte/blob/master/airbyte-cdk/python/airbyte_cdk/sources/abstract_source.py#L171-L177 , during a read of a stream always try to catch any form of Exception or AirbyteTracedException thrown by a stream and wrap it into an emitted trace message. We only formally raise the "final" AirbyteTracedException because we need to end the process with a non-zero error code.

Granted the small gap is if we get an exception outside of that try/except block. But from my perspective, us trying to adjust the protocol or create new abstractions to capture what is effectively a work around feels like we're trying to over engineer a bit. The ideal design is that the platform can identify that merely receiving any AirbyteTracedException is enough to know that the sync was unsuccessful. And then we could get rid of this work around.

So TLDR: I think how we do it here is fine.

@@ -39,6 +39,8 @@ class StreamReadSlices:
class LogMessage:
message: str
level: str
internal_message: Optional[str] = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats a good callout Natik!

Looking into it, this does already align with the protocol model AirbyteTraceMessage

https://github.com/airbytehq/airbyte-protocol/blob/e24ee151574a7f0f4e997b90063d7c9ffed2b158/protocol-models/src/main/resources/airbyte_protocol/v0/airbyte_protocol.yaml#L256

@lmossman can you confirm Im reading this as this is already a protocol blessed type?

If so should LogMessage become a union type of AirbyteTraceMessage and AirbyteLogMessage

@lmossman lmossman force-pushed the lmossman/better-builder-errors branch from a93d329 to 0200577 Compare October 19, 2024 00:36
Copy link
Contributor

@brianjlai brianjlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i know there are other questions of the marketplace side, but not concerns by me

# is that this message will be shown in the Builder.
if (
traced_exception.message is not None
and "During the sync, the following streams did not sync successfully" in traced_exception.message
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also add that any raised AirbyteTracedException is sort of already implied to be the final exception. Within abstract_source.py, https://github.com/airbytehq/airbyte/blob/master/airbyte-cdk/python/airbyte_cdk/sources/abstract_source.py#L171-L177 , during a read of a stream always try to catch any form of Exception or AirbyteTracedException thrown by a stream and wrap it into an emitted trace message. We only formally raise the "final" AirbyteTracedException because we need to end the process with a non-zero error code.

Granted the small gap is if we get an exception outside of that try/except block. But from my perspective, us trying to adjust the protocol or create new abstractions to capture what is effectively a work around feels like we're trying to over engineer a bit. The ideal design is that the platform can identify that merely receiving any AirbyteTracedException is enough to know that the sync was unsuccessful. And then we could get rid of this work around.

So TLDR: I think how we do it here is fine.

@lmossman
Copy link
Contributor Author

lmossman commented Oct 23, 2024

/approve-regression-tests

Only changes the connector_builder wrapper around airbyte-cdk, so shouldn't affect any actual connectors. I have tested this locally against the connector builder (along with the platform changes linked in the PRs in the description)

Check job output.

✅ Approving regression tests

@lmossman
Copy link
Contributor Author

lmossman commented Oct 23, 2024

/approve-regression-tests

Only changes the connector_builder wrapper around airbyte-cdk, so shouldn't affect any actual connectors. I have tested this locally against the connector builder (along with the platform changes linked in the PRs in the description)

Check job output.

✅ Approving regression tests

@lmossman lmossman merged commit b590a21 into master Oct 23, 2024
35 checks passed
@lmossman lmossman deleted the lmossman/better-builder-errors branch October 23, 2024 22:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CDK Connector Development Kit
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants